1 Profile-based optimizations: Reality-based optimization
Scott McFarling 

March 2003 Proceedings of the international symposium on Code generation and i
runtime optimization 

Full text available: pdf Publisher Site Additional Information: full citation, abs

Profile-based optimization has been studied extensively. Numerous papers ani 
improvements. However, most of these papers have been limited to either brc 
performance. Also, most of these papers have looked at small applications wit 
training scenarios. In this paper, we look at real use of large real-world deskto 
consumption and disk performance are the primar ... 



2 Systematic Power-Performance Trade-Off in MPEG-4 by Means of Selecti
Address Optimization Opportunities 
N. Palkovic, M. Miranda, F. Catthoor

March 2002 Proceedings of the conference on Design, automation and test in 

Full text available: pdf(116.36 KB) Publisher Site Additional Information

The hierarchical structure of real-life data-dominated applications limits the ex
optimisations. This limitation is often overcome by function inlining. However,
which causes a significant growth of instruction cache misses and thus perform
confirmed on experiments with our applications. We have developed a novel r 
inlining steered by cost/gain balance to trade-off ... 
3 Near-optimal intraprocedural branch alignment

Cliff Young, David S. Johnson, Michael D. Smith, David R. Karger
May 1997 ACM SIGPLAN Notices, Proceedings of the ACM SIGPLAN 1997 conference
and implementation. Volume 32 Issue 5

Full text available: pdf(1.56 MB) Additional Information: full citation, abstract, references, ci

Branch alignment reorders the basic blocks of a program to minimize pipeline 
instructions. Prior work in branch alignment has produced useful heuristic met 
algorithm that usually achieves the minimum possible pipeline penalty and or 
of a provable optimum. We compare the control penalties and running times c 
approach and observe that both the greedy method an ... 

4 Temperature-aware microarchitecture: Modeling and implementation
Kevin Skadron, Mircea R. Stan, Karthik Sankaranarayanan, Wei Huang, Sivakum
March 2004 ACM Transactions on Architecture and Code Optimization (TACO),
Full text available: pdf(1.42 MB) Additional Information: full citation, abstract, references, cit

With cooling costs rising exponentially, designing cooling solutions for worst-c 
expensive. Chips that can autonomously modify their execution and power-di
of lower-cost cooling solutions while still guaranteeing safe temperature regul 
dynamic thermal management (DTM), however, requires a thermal model tha 
studies. This paper describes HotSpo ... 

Keywords: Dynamic compact thermal models, dynamic thermal management, 
control, fetch gating 



5 TRIPS: A polymorphous architecture for exploiting ILP, TLP, and DLP 
Karthikeyan Sankaralingam, Ramadass Nagarajan, Haiming Liu, Changkyu Kim,
Burger, Stephen W. Keckler, Robert G. McDonald, Charles R. Moore 
March 2004 ACM Transactions on Architecture and Code Optimization (TACO), 
Full text available: pdf(632.30 KB) Additional Information: full citation, abstract, referen

This paper describes the polymorphous TRIPS architecture that can be configu 
of parallelism. The TRIPS architecture is the first in a class of post-RISC, data 
data-graph execution (EDGE). This EDGE ISA is coupled with hardware mechc 
and the on-chip memory system to be configured and combined in different m 
thread-level parallelism. To adapt ... 

Keywords: Computer architecture, configurable computing, scalable and high- 
6 Papers: The lookahead strategy for distance-based location tracking in wire

I-Fei Tsai, Rong-Hong Jan 

October 1999 ACM SIGMOBILE Mobile Computing and Communications Review,

Full text available: pdf(1.27 MB) Additional Information: full citation, abstract, referen

Based on a multi-scale, straight-oriented mobility model, this paper presents 
location tracking so the rate of location update can be reduced without incurri 
linear mobility graphs, the optimal registered cell is found by an iterative algc 
maximized. For planar mobility graphs, the authors employ the results from li 
registered cell. Performance gain i ... 

7 Aggressive inlining 

Andrew Ayers, Richard Schooler, Robert Gottlieb 

May 1997 ACM SIGPLAN Notices, Proceedings of the ACM SIGPLAN 1997 conference
and implementation. Volume 32 Issue 5

Full text available: pdf(1.40 MB) Additional Information: full citation, abstract, references, ci

Existing research understates the benefits that can be obtained from inlining i 
profile information. Our implementation of inlining and cloning yields excellen 
lowers performance. We believe our good results can be explained by a numb
intermediate-code level removes most technical restrictions on what can be ir 
and incorporate profile information enables ... 

8 Datapath and control for quantum wires 

Nemanja Isailovic, Mark Whitney, Yatish Patel, John Kubiatowicz, Dean Copsey, 
Mark Oskin 

March 2004 ACM Transactions on Architecture and Code Optimization (TACO), 
Full text available: pdf(476.63 KB) Additional Information: full citation, abstract, reference

As quantum computing moves closer to reality the need for basic architectura 
Quantum wires, which transport quantum data, will be a fundamental compon 
architectures. Since they cannot consist of a stream of electrons, as in the cla 
fundamentally be designed differently. In this paper, we present two quantum
swapping of adjacent qubits, and a teleportation wire, ... 

Keywords: Architecture, Control, Layout 
9 Software profiling for hot path prediction: less is more
Evelyn Duesterwald, Vasanth Bala 

November 2000 Proceedings of the ninth international conference on Architectural 
operating systems, Volume 34, 28 Issue 5, 5

Full text available: pdf(286.07 KB) Additional Information: full citation, abstract, references,

Recently, there has been a growing interest in exploiting profile information ir 
compilers, dynamic optimizers, and binary translators. In this paper, we show
schemes that provide highly accurate information in an offline setting are ill-s 
systems. We experimentally demonstrate that hot path predictions must be m 
cost of missed opportunity tha ...

10 Advanced design and modeling techniques: Optimal design of high fan-in n
nonlinear programming 

Hsu-Wei Huang, Cheng-Yeh Wang, Jing-Yang Jou 

January 2004 Proceedings of the 2004 conference on Asia South Pacific design aut 
fair 2004 

Full text available: pdf(142.54 KB) Additional Information: full citation, abstract,

In this paper, a novel strategy for designing the heterogeneous-tree multiplex 
delay model by curve fitting and then formulate the heterogeneous-tree multi 
of optimization problem called mixed-integer nonlinear programming (MINLP) 
size in each stage, is introduced to improve the speed of the heterogeneous-ti 
can determine the multiplexer architec ... 

11 Future technologies: Timing, energy, and thermal performance of three-din
Shamik Das, Anantha Chandrakasan, Rafael Reif 

April 2004 Proceedings of the 14th ACM Great Lakes symposium on VLSI

Full text available: pdf(488.45 KB) Additional Information: full citation, abstract, referenc

We examine the performance of custom circuits in an emerging technology kr 
By combining multiple device layers with a high-density inter-layer interconn
expected to provide better timing and energy performance relative to a single
circuit. In this paper, we show that by using our performance-driven design tc 
dissipation of standard-cell circuits can ... 

Keywords: 3-D IC, 3-D integration, energy, thermal optimization, timing 
12 GADGET: a toolkit for optimization-based approaches to interface and disp
James Fogarty, Scott E. Hudson 

November 2003 Proceedings of the 16th annual ACM symposium on User interfac 

Full text available: pdf(823.53 KB) Additional Information: full citation, abstract, references, c

Recent work is beginning to reveal the potential of numerical optimization as 
and displays. Optimization-based approaches can often allow a mix of indeper 
blended in ways that would be difficult to describe algorithmically. While optir 
offer several potential advantages, further research in this area is hampered t 
paper presents GADGET, an experimental toolk ... 

Keywords: display generation, layout algorithms, numerical optimization, perc 



13 Server performance and scalability: A smart hill-climbing algorithm for appl
Bowei Xi, Zhen Liu, Mukund Raghavachari, Cathy H. Xia, Li Zhang
May 2004 Proceedings of the 13th international conference on World Wide W
Full text available: pdf(373.43 KB) Additional Information: full citation, abstract, referenc

The overwhelming success of the Web as a mechanism for facilitating informa 
business transactions has led to an increase in the deployment of complex ent
typically run on Web Application Servers, which assume the burden of managi
memory management, database access, etc., required by these applications.
Server depends heavily on appropriate configuration. Co ... 

Keywords: automatic tuning, gradient method, importance sampling, simulate 



14 Extending Path Profiling across Loop Backedges and Procedure Boundaries
Sriraman Tallam, Xiangyu Zhang, Rajiv Gupta 

March 2004 Proceedings of the international symposium on Code generation and
runtime optimization

Full text available: pdf(416.54 KB) Additional Information: full citation

Since their introduction, path profiles have been used to guide the application
and performing instruction scheduling. However, for optimization and schedulin
frequency counts of paths that extend across loop iterations and cross procedur
referred to as interesting paths in this paper, account for over 75% of the flow
Although the frequency counts of interesting paths can b ...

Keywords: path profiles, overlapping path profiles, profile guided optimization.
15 Placement techniques: FastPlace: efficient analytical placement using cell
a hybrid net model 

Natarajan Viswanathan, Chris Chong-Nuen Chu 

April 2004 Proceedings of the 2004 international symposium on Physical desi
Full text available: pdf(237.50 KB) Additional Information: full citation, abstract, referen

In this paper, we present FastPlace -- a fast, iterative, flat placement algorithi 
FastPlace is based on the quadratic placement approach. The quadratic appro
minimization problem as a convex quadratic program, which can be solved eff
However, it suffers from some drawbacks. First, the resulting placement has a
resulting total wirelength ... 

Keywords: analytical placement, net models, standard cell placement 

16 Advances in embedded software scheduling techniques: Pareto-optimizatic 
embedded systems 
Peng Yang, Francky Catthoor 

October 2003 Proceedings of the 1st IEEE/ACM/IFIP international conference on H
synthesis 

Full text available: pdf(213.01 KB) Additional Information: full citation, abstract, references,

Pareto-set-based optimization can be found in several different areas of embe 
task scheduling, where different task mapping and ordering choices for a targi 
performance/cost tradeoffs. To explore this design space at run-time, a fast ai 
have modeled the problem as the well known Multiple Choice Knapsack Proble 
greedy heuristic for the run-time task scheduling. To ... 

Keywords: Pareto optimization, embedded system, low-power, scheduling

17 Code scheduling: Phi-Predication for light-w
Weihaw Chuang, Brad Calder, Jeanne Ferrante 

March 2003 Proceedings of the international symposium on Code generation and
runtime optimization

Full text available: pdf Publisher Site Additional Information: full citation, abstract

Predicated execution can eliminate hard to predict branches and help to enabi 
current predication variants exist where the result update is conditional based 
predicate. However, conditional writing of a register creates a naming problen 
stall the issuing of instructions. This problem arises from potential multiple pr
which is unresolved until the prior ... 
18 A comparative study of static and profile-based heuristics for inlining
Matthew Arnold, Stephen Fink, Vivek Sarkar, Peter F. Sweeney 

January 2000 ACM SIGPLAN Notices, Proceedings of the ACM SIGPLAN workshop on
and optimization. Volume 35 Issue 7

Full text available: pdf(1.13 MB) Additional Information: full citation, abstract, references, ci

In this paper, we present a comparative study of static and pro
Our motivation for this study is to use the results to design the
can for the Jalapeno dynamic optimizing compiler for Java [6]. 
approximation algorithm for the KNAPSACK problem as a comm 
“meta-algorithm” for the inlining heuristics studie 
performance results for an implementation of these inlinin ... 

19 Database theory, technology and applications (DTTA): On the semantics ai 
languages for NP search and optimization problems 

E. Zumpano, S. Greco, I. Trubitsyna, P. Veltri 

March 2004 Proceedings of the 2004 ACM symposium on Applied computing

Full text available: pdf(233.87 KB) Additional Information: full citation, abstract, referenc

It has been shown that NP (decision, search and optimization) problems can b 
(Datalog with unstratified negation) queries under stable model semantics. Ar
is often neither simple nor intuitive and, besides, DATALOG does not allow to 
expressive power. This paper analyzes the power of Datalog-like languages in 
problems. In more detail, in t ... 

Keywords: datalog, deductive and logic databases, expressive power of query 
queries



20 Positional adaptation of processors: application to energy reduction 
Michael C. Huang, Jose Renau, Josep Torrellas 

May 2003 ACM SIGARCH Computer Architecture News , Proceedings of the 30th a 
Computer architecture. Volume 31 Issue 2 

Full text available: pdf(225.57 KB) Additional Information: full citation, abstract, refe

Although adaptive processors can exploit application variability to improve pei 
managing their adaptivity is challenging. To address this problem, we introdu
Positional approach. In this approach, both the testing of configurations and tf 
configurations are associated with particular code sections. This is in contrast 
approach to adaptation ... 